EmoTweet-28: A Fine-Grained Emotion Corpus for Sentiment Analysis
Abstract
This paper describes EmoTweet-28, a carefully curated corpus of 15,553 tweets annotated with 28 emotion categories for the purpose of training and evaluating machine learning models for emotion classification. EmoTweet-28 is, to date, the largest tweet corpus annotated with fine-grained emotion categories. The corpus contains annotations for four facets of emotion: valence, arousal, emotion category and emotion cues. We first used small-scale content analysis to inductively identify a set of emotion categories that characterize the emotions expressed in microblog text. We then expanded the size of the corpus using crowdsourcing. The corpus encompasses a variety of examples including explicit and implicit expressions of emotions as well as tweets containing multiple emotions. EmoTweet-28 represents an important resource to advance the development and evaluation of more emotion-sensitive systems.
Similar resources
Exploring Fine-Grained Emotion Detection in Tweets
We examine if common machine learning techniques known to perform well in coarse-grained emotion and sentiment classification can also be applied successfully on a set of fine-grained emotion categories. We first describe the grounded theory approach used to develop a corpus of 5,553 tweets manually annotated with 28 emotion categories. From our preliminary experiments, we have identified two ma...
Exposing a Set of Fine-Grained Emotion Categories from Tweets
An important starting point in analyzing emotions on Twitter is the identification of a set of suitable emotion classes representative of the range of emotions expressed on Twitter. This paper first presents a set of 48 emotion categories discovered inductively from 5,553 annotated tweets through a small-scale content analysis by trained or expert annotators. We then refine the emotion categori...
The Constitution of a Fine-Grained Opinion Annotated Corpus on Weibo
Sentiment analysis on social media such as Weibo is one of the most active research problems in NLP. A comprehensive and systematic fine-grained annotated corpus plays a significant role. In this paper, considering the characteristics of Weibo, we focus on the constitution of a fine-grained, hierarchical opinion annotated corpus and design a set of labelling specifications. We manually annot...
Annotation, Modelling and Analysis of Fine-Grained Emotions on a Stance and Sentiment Detection Corpus
There is a rich variety of data sets for sentiment analysis (viz., polarity and subjectivity classification). For the more challenging task of detecting discrete emotions following the definitions of Ekman and Plutchik, however, there are far fewer data sets, and notably no resources for the social media domain. This paper contributes to closing this gap by extending the SemEval 2016 stance an...
A Topic Model for Building Fine-grained Domain-specific Emotion Lexicon
Emotion lexicons play a crucial role in sentiment analysis and opinion mining. In this paper, we propose a novel Emotion-aware LDA (EaLDA) model to build a domain-specific lexicon for predefined emotions that include anger, disgust, fear, joy, sadness, surprise. The model uses a minimal set of domain-independent seed words as prior knowledge to discover a domain-specific lexicon, learning a fine-...